In this study, a consistency analysis of the energy parameter for Mandarin speech is presented. Identified by inspecting the human pronunciation process, the consistency can be interpreted as a high correlation of the warping curve between the spectrum and the prosody within a syllable. The consistency analysis proceeds in three steps. First, the hidden Markov model (HMM) algorithm is used to decode the HMM-state sequence within a syllable and, at the same time, to divide it into three segments. Second, for a designated syllable, vector quantization (VQ) with the Linde–Buzo–Gray algorithm is used to train a VQ codebook for each segment. Third, the energy vector of each segment is encoded as an index by the VQ codebooks, and the probability of each possible path is then evaluated as a prerequisite for analyzing the consistency. Experiments demonstrate that consistency is clearly obtained when the syllable occurs in exactly the same word. These results suggest a research direction: the energy warping process within a syllable must be considered in a text-to-speech system to improve the quality of the synthesized speech.
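The second and third steps can be illustrated with a minimal sketch, assuming the HMM segmentation has already produced three energy vectors per syllable token. The function names, the 2-dimensional energy vectors, the codebook size of 4, the additive splitting perturbation, and the synthetic data are all assumptions made for illustration, and the path probability is taken here as a simple relative frequency, which may differ from the estimate used in the study.

```python
import random
from collections import Counter


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)),
    )


def lbg(vectors, size, eps=0.01, iters=20):
    """Train a VQ codebook by LBG splitting (size assumed to be a power of two)."""
    dim = len(vectors[0])
    # Start from a single codeword: the centroid of all training vectors.
    codebook = [[sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]]
    while len(codebook) < size:
        # Split every codeword into a pair shifted by +eps and -eps.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword, then recompute centroids.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            for i, cell in enumerate(cells):
                if cell:  # an empty cell keeps its previous codeword
                    codebook[i] = [
                        sum(v[d] for v in cell) / len(cell) for d in range(dim)
                    ]
    return codebook


def path_probabilities(tokens, codebooks):
    """Encode each token as a path of three VQ indices; return relative frequencies."""
    paths = [
        tuple(nearest(cb, seg) for cb, seg in zip(codebooks, tok)) for tok in tokens
    ]
    counts = Counter(paths)
    return {path: n / len(paths) for path, n in counts.items()}


# Synthetic stand-in data (not from the study): 40 tokens of one syllable,
# each with three segment-level energy vectors of dimension 2.
rng = random.Random(0)
tokens = [
    [[rng.gauss(m, 0.1), rng.gauss(m, 0.1)] for m in (1.0, 2.0, 1.5)]
    for _ in range(40)
]

# One codebook per segment, trained on that segment's vectors only.
codebooks = [lbg([tok[s] for tok in tokens], size=4) for s in range(3)]
probs = path_probabilities(tokens, codebooks)
best = max(probs, key=probs.get)
print(best, probs[best])
```

Under this reading, a probability mass concentrated on a few index paths would indicate a consistent energy contour across tokens of the syllable, while a nearly uniform distribution would indicate little consistency.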